
feat: add Map and Set sanitization via sanitizeCollections option #305

Merged
ioncache merged 6 commits into main from worktree-feat+map-set-sanitization
May 25, 2026
Conversation

Owner

@ioncache ioncache commented May 25, 2026

Overview

Adds opt-in Map and Set sanitization via a new sanitizeCollections option (default false). When enabled, objectReplacer traverses each collection and returns a new sanitized copy — the original is never mutated.

Details

  • sanitizeCollections?: boolean added to DataSanitizationReplacerOptions
  • Map entries: string keys are matched against field-name patterns; matched values are masked or removed. Object keys are recursed into and sanitized. String keys themselves are not sanitized (consistent with plain object property name behaviour).
  • Set values: each item is recursed through sanitizeValue — strings are scanned for embedded patterns, objects are sanitized recursively.
  • Defaults to false so existing pass-through behaviour for non-plain objects is preserved.
  • README: new "Sanitize Maps and Sets" usage subsection with serialization tip (Maps/Sets are not JSON-serializable by default), updated options table, and How it works note on key-sanitization limitation.
  • Two benchmark cases added to bench/sanitize-data.bench.ts (Map shallow + Set small).
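
The Map and Set rules listed above can be sketched in a few lines. This is a simplified illustration, not the code in src/replacers.ts: the patterns `SENSITIVE_KEY` and `EMBEDDED`, the `MASK` constant, and this standalone `sanitizeValue` are all assumptions made for the example, and options such as removeMatches are left out.

```typescript
// Hypothetical, simplified sketch: NOT the library's actual implementation.
const SENSITIVE_KEY = /^(password|token|api_key|secret)$/i; // assumed field-name pattern
const EMBEDDED = /\b(api_key|password|token)=([^\s&]+)/gi; // assumed embedded-value pattern
const MASK = "**********"; // assumed mask, same literal as quoted in the review below

function sanitizeValue(value: unknown): unknown {
  if (typeof value === "string") {
    // Strings are scanned for embedded "name=value" secrets.
    return value.replace(EMBEDDED, (_match, name: string) => `${name}=${MASK}`);
  }
  if (value instanceof Map) {
    const copy = new Map<unknown, unknown>(); // always a new Map, never the original
    value.forEach((entry, key) => {
      if (typeof key === "string") {
        // String keys are matched against field names but are not rewritten themselves.
        copy.set(key, SENSITIVE_KEY.test(key) ? MASK : sanitizeValue(entry));
      } else {
        // Non-string keys (for example objects) are recursed into like values.
        copy.set(sanitizeValue(key), sanitizeValue(entry));
      }
    });
    return copy;
  }
  if (value instanceof Set) {
    const copy = new Set<unknown>();
    value.forEach((item) => copy.add(sanitizeValue(item)));
    return copy;
  }
  if (Array.isArray(value)) {
    return value.map((item) => sanitizeValue(item));
  }
  if (value !== null && typeof value === "object" &&
      Object.getPrototypeOf(value) === Object.prototype) {
    const source = value as Record<string, unknown>;
    const out: Record<string, unknown> = {};
    for (const k of Object.keys(source)) {
      out[k] = SENSITIVE_KEY.test(k) ? MASK : sanitizeValue(source[k]);
    }
    return out;
  }
  return value; // other non-plain objects (Date, class instances) pass through
}

const original = new Map<unknown, unknown>([
  ["password", "hunter2"],
  ["message", "api_key=abc123"],
]);
const sanitized = sanitizeValue(original) as Map<unknown, unknown>;
const sanitizedSet = sanitizeValue(new Set<unknown>(["api_key=hunter2"])) as Set<unknown>;
```

In this sketch `sanitized` is a different Map instance from `original`, which still holds the raw values, mirroring the "original is never mutated" guarantee stated in the overview.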

Related Tickets and/or Pull Requests

Relates to plan 012

Checklist

  • Tests added or updated
  • README and TSDoc updated if the public API changed
  • Breaking changes called out (if any)
  • Roadmap item checked off if this PR completes one

Summary by CodeRabbit

  • New Features

    • Optional support to sanitize Map and Set instances so these collections can be processed like objects/arrays.
  • Documentation

    • Added a "Sanitize Maps and Sets" section with examples and updated the options table and roadmap entries.
  • Tests

    • New test coverage validating Map/Set sanitization behavior and recursive handling.
  • Benchmarks

    • Added benchmark suites for Map and Set sanitization scenarios.
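
On the serialization point raised in the description: `JSON.stringify` renders Map and Set instances as `{}`, so a sanitized collection needs converting before it is logged as JSON. The README's actual tip is not reproduced here; the following is one possible approach, and `collectionReplacer` is a hypothetical helper name.

```typescript
// Hypothetical helper: converts Maps to arrays of [key, value] pairs and
// Sets to arrays of values so JSON.stringify can serialize them.
function collectionReplacer(_key: string, value: unknown): unknown {
  if (value instanceof Map) return Array.from(value.entries());
  if (value instanceof Set) return Array.from(value.values());
  return value;
}

const payload = {
  headers: new Map<string, string>([["authorization", "**********"]]),
  tags: new Set<string>(["a", "b"]),
};

// Without a replacer, the collections collapse to empty objects.
const naive = JSON.stringify(payload);
// naive === '{"headers":{},"tags":{}}'

const json = JSON.stringify(payload, collectionReplacer);
// json === '{"headers":[["authorization","**********"]],"tags":["a","b"]}'
```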


Contributor

coderabbitai Bot commented May 25, 2026

Walkthrough

This PR implements an opt-in sanitizeCollections feature that enables sanitizing Map and Set instances by traversing their entries, applying sensitive-pattern matching and masking rules to keys and values, and returning new sanitized copies. The implementation includes type additions, core recursive logic, tests, benchmarks, and README/roadmap/docs updates.

Changes

Map and Set Sanitization Feature

| Layer / File(s) | Summary |
| --- | --- |
| Type contract and option definition: src/types.ts | DataSanitizationReplacerOptions gains a new optional sanitizeCollections?: boolean property to toggle Map/Set sanitization behavior. |
| Map and Set sanitization logic: src/replacers.ts | objectReplacer adds sanitizeCollections parameter handling; the recursive sanitizeValue function detects Map and Set values, iterates entries, matches keys/values against patterns, applies masking/removal for sensitive matches, recursively sanitizes non-sensitive entries, and returns new cloned Map/Set instances. |
| Test coverage for Map and Set: test/replacers.test.ts | Comprehensive test suites for Map and Set sanitization covering pass-through vs. cloning behavior, key-based masking/removal, numeric masking, string value pattern scanning/remasking, recursive sanitization of nested objects and collections, and Map-in-Set interactions. |
| README documentation and examples: README.md | Adds Table of Contents entry, new "Sanitize Maps and Sets" section with Map/Set usage examples and JSON serialization notes, updates "How it works" section to describe sanitizeCollections: true behavior, and adds sanitizeCollections row to options table. |
| Performance benchmarks: bench/sanitize-data.bench.ts | Adds benchmark documentation bullet and two new sanitizeData benchmark suites: shallow Map with a sensitive entry and small Set with an embedded-pattern string, both exercising sanitizeCollections: true (Map includes scanStringValues: false variant). |
| Feature plan and roadmap status: docs/plans/012-map-set-sanitization.md, docs/ROADMAP.md | New plan document specifies sanitization semantics for Map keys, masking rules, and recursive behavior; roadmap table updated to mark Map and Set as implemented via sanitizeCollections: true, and collection usage-signal guidance was rewritten. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 A rabbit hops through Maps and Sets so neat,
Flag in paw, it keeps your secrets discreet.
Keys masked, values trimmed with care,
Nesting handled everywhere—
Collections tidy, now the data's sweet!

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the main feature: adding Map and Set sanitization via a new sanitizeCollections option. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |



github-actions Bot commented May 25, 2026

Coverage Report

| Status | Category | Percentage | Covered / Total |
| --- | --- | --- | --- |
| 🔵 | Lines | 100% (🎯 100%) | 171 / 171 |
| 🔵 | Statements | 100% (🎯 100%) | 175 / 175 |
| 🔵 | Functions | 100% (🎯 100%) | 22 / 22 |
| 🔵 | Branches | 100% (🎯 100%) | 119 / 119 |

File Coverage (Changed Files)

| File | Stmts | Branches | Functions | Lines | Uncovered Lines |
| --- | --- | --- | --- | --- | --- |
| src/replacers.ts | 100% | 100% | 100% | 100% | |
Generated in workflow #224 for commit c8b3a16 by the Vitest Coverage Report Action

Contributor

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
test/replacers.test.ts (1)

1128-1128: ⚡ Quick win

Use DEFAULT_PATTERN_MASK instead of hardcoded mask literals.

These two assertions hardcode '**********', which makes tests brittle if the default mask changes.

Proposed fix
-        expect(sanitized.get('message')).toBe('api_key=**********');
+        expect(sanitized.get('message')).toBe(
+          `api_key=${DEFAULT_PATTERN_MASK}`,
+        );

-        expect(sanitized.has('api_key=**********')).toBe(true);
+        expect(
+          sanitized.has(`api_key=${DEFAULT_PATTERN_MASK}`),
+        ).toBe(true);

Also applies to: 1231-1231

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/replacers.test.ts` at line 1128, The test is asserting a hardcoded mask
string ('**********') which should use the shared DEFAULT_PATTERN_MASK constant
to avoid brittleness; update the two assertions that check
sanitized.get('message') (and the other occurrence at the second mentioned
location) to compare against DEFAULT_PATTERN_MASK instead of the literal
'**********' so the test follows the current default mask value (import or
reference DEFAULT_PATTERN_MASK in tests if not already imported).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@docs/ROADMAP.md`:
- Around line 147-148: Update the "Collect Usage Signals" section in ROADMAP.md
to remove or rephrase the guidance that says you must gather usage signals
before planning Map/Set support, since Map and Set are now marked implemented in
the table; locate the table rows referencing "Map" and "Set" and the "Collect
Usage Signals" heading and change the phrasing to acknowledge Map/Set are
implemented (for example, note they are implemented and optional via
sanitizeCollections: true or suggest monitoring post-release usage instead of
pre-planning signal collection).

---

Nitpick comments:
In `@test/replacers.test.ts`:
- Line 1128: The test is asserting a hardcoded mask string ('**********') which
should use the shared DEFAULT_PATTERN_MASK constant to avoid brittleness; update
the two assertions that check sanitized.get('message') (and the other occurrence
at the second mentioned location) to compare against DEFAULT_PATTERN_MASK
instead of the literal '**********' so the test follows the current default mask
value (import or reference DEFAULT_PATTERN_MASK in tests if not already
imported).

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 831f4423-c8d4-4009-9dae-6d101b51e606

📥 Commits

Reviewing files that changed from the base of the PR and between cfac143 and 845420d.

📒 Files selected for processing (7)
  • README.md
  • bench/sanitize-data.bench.ts
  • docs/ROADMAP.md
  • docs/plans/012-map-set-sanitization.md
  • src/replacers.ts
  • src/types.ts
  • test/replacers.test.ts

Comment thread docs/ROADMAP.md
Contributor

@coderabbitai coderabbitai Bot left a comment


🧹 Nitpick comments (2)
test/replacers.test.ts (2)

1194-1274: 💤 Low value

Optionally add tests for removeMatches and scanStringValues: false with Sets.

For additional coverage completeness:

  1. removeMatches with Set string scanning — When a Set contains 'api_key=hunter2' and removeMatches: true, verify the behavior (likely produces empty string from the scan-and-remove logic).

  2. scanStringValues: false with Set — Verify that when scanStringValues: false, string items in Sets are not scanned for embedded patterns, consistent with plain object behavior (tested at line 956-973).

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/replacers.test.ts` around lines 1194 - 1274, Add two new tests under the
existing "Set sanitization" describe that call objectReplacer with
sanitizeCollections: true and reference DEFAULT_PATTERN_MASK: one test sets
removeMatches: true and a Set containing 'api_key=hunter2' and asserts the Set
item is removed or becomes the expected empty/removed-result per the
scan-and-remove logic (use removeMatches flag passed to objectReplacer); the
other test sets scanStringValues: false with a Set containing 'api_key=hunter2'
and asserts the string is left unchanged (i.e., still 'api_key=hunter2'),
mirroring the behavior asserted for plain objects; locate usages near the other
Set tests so they use the same patterns and assertions.

1036-1274: ⚡ Quick win

Consider adding tests for circular references and non-plain object preservation.

The Map and Set test coverage is solid for the main scenarios. However, two important edge cases are missing that align with the coding guidelines:

  1. Circular reference detection — The implementation uses WeakSet to detect circular references and throws TypeError. Add tests like:

    const circularMap = new Map();
    circularMap.set('self', circularMap);
    expect(() => objectReplacer({ data: circularMap }, { sanitizeCollections: true }))
      .toThrow(TypeError);
  2. Non-plain object preservation in collections — Per guidelines, non-plain objects (e.g., Date, custom classes) should pass through unchanged. The test at line 754-796 verifies this when sanitizeCollections is false, but not when it's true. Add tests like:

    const date = new Date('2024-01-01');
    const map = new Map<string, unknown>([['timestamp', date], ['username', 'mark']]);
    const result = objectReplacer({ map }, { sanitizeCollections: true });
    const sanitized = result.map as Map<string, unknown>;
    expect(sanitized.get('timestamp')).toBe(date); // same instance

As per coding guidelines, object replacers must detect circular references and preserve non-plain object instances.
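
For readers unfamiliar with the WeakSet technique mentioned in this comment, here is a minimal sketch of ancestor-based cycle detection across plain objects, arrays, Maps, and Sets. It is an illustration only: `assertNoCycles` is a hypothetical function, and the way src/replacers.ts actually tracks visited values is not shown in this thread.

```typescript
// Hypothetical sketch of WeakSet-based cycle detection; not the library's code.
// The WeakSet holds only the objects on the current traversal path, so the same
// object may appear twice in a structure as long as it does not contain itself.
function assertNoCycles(
  value: unknown,
  ancestors: WeakSet<object> = new WeakSet<object>(),
): void {
  if (value === null || typeof value !== "object") return; // primitives cannot cycle
  const node: object = value;
  if (ancestors.has(node)) {
    throw new TypeError("Circular reference detected");
  }
  ancestors.add(node);
  if (node instanceof Map) {
    node.forEach((entry, key) => {
      assertNoCycles(key, ancestors);
      assertNoCycles(entry, ancestors);
    });
  } else if (node instanceof Set) {
    node.forEach((item) => assertNoCycles(item, ancestors));
  } else if (Array.isArray(node)) {
    node.forEach((item) => assertNoCycles(item, ancestors));
  } else if (Object.getPrototypeOf(node) === Object.prototype) {
    const record = node as Record<string, unknown>;
    Object.keys(record).forEach((k) => assertNoCycles(record[k], ancestors));
  }
  ancestors.delete(node); // leaving this branch: the node is no longer an ancestor
}

// A Map that contains itself is rejected.
const selfReferencing = new Map<string, unknown>();
selfReferencing.set("self", selfReferencing);
let threwTypeError = false;
try {
  assertNoCycles({ data: selfReferencing });
} catch (error) {
  threwTypeError = error instanceof TypeError;
}

// A shared (but not circular) reference is accepted.
const shared = { id: 1 };
let sharedAccepted = true;
try {
  assertNoCycles(new Map<string, unknown>([["a", shared], ["b", shared]]));
} catch {
  sharedAccepted = false;
}
```

Removing each node from the set on the way back up is a design choice in this sketch: a visited-forever set would also flag repeated, non-circular references as cycles.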

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/replacers.test.ts` around lines 1036 - 1274, Add two tests for
collection handling: (1) verify circular reference detection by creating a
Map/Set that contains itself and asserting objectReplacer({ data: circular }, {
sanitizeCollections: true }) throws a TypeError (this exercises the
WeakSet-based cycle detection inside objectReplacer); (2) verify non-plain
objects are preserved when sanitizeCollections is true by putting a Date (or
custom class instance) into a Map/Set alongside a normal key and asserting the
sanitized collection contains the exact same Date instance (not cloned or
modified) while other sensitive entries are masked; place these alongside the
existing Map/Set tests referencing objectReplacer and the sanitizeCollections
option.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@test/replacers.test.ts`:
- Around line 1194-1274: Add two new tests under the existing "Set sanitization"
describe that call objectReplacer with sanitizeCollections: true and reference
DEFAULT_PATTERN_MASK: one test sets removeMatches: true and a Set containing
'api_key=hunter2' and asserts the Set item is removed or becomes the expected
empty/removed-result per the scan-and-remove logic (use removeMatches flag
passed to objectReplacer); the other test sets scanStringValues: false with a
Set containing 'api_key=hunter2' and asserts the string is left unchanged (i.e.,
still 'api_key=hunter2'), mirroring the behavior asserted for plain objects;
locate usages near the other Set tests so they use the same patterns and
assertions.
- Around line 1036-1274: Add two tests for collection handling: (1) verify
circular reference detection by creating a Map/Set that contains itself and
asserting objectReplacer({ data: circular }, { sanitizeCollections: true })
throws a TypeError (this exercises the WeakSet-based cycle detection inside
objectReplacer); (2) verify non-plain objects are preserved when
sanitizeCollections is true by putting a Date (or custom class instance) into a
Map/Set alongside a normal key and asserting the sanitized collection contains
the exact same Date instance (not cloned or modified) while other sensitive
entries are masked; place these alongside the existing Map/Set tests referencing
objectReplacer and the sanitizeCollections option.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 6c57f9b6-ec29-4f2b-aa7b-7785e092978d

📥 Commits

Reviewing files that changed from the base of the PR and between 845420d and c8b3a16.

📒 Files selected for processing (2)
  • docs/ROADMAP.md
  • test/replacers.test.ts

@ioncache ioncache merged commit 83cc928 into main May 25, 2026
5 checks passed
@ioncache ioncache deleted the worktree-feat+map-set-sanitization branch May 25, 2026 10:48